- "filter", v. 2.1 Latest mod: 12:30 Jun 8 1994
- A `grep'-like text searcher for multiple simultaneous keyword tests
-
- Copyright 1994 by Joel Polowin, Department of Chemistry, Queen's University,
- Kingston, Ontario, Canada. Permission granted for free use and distribution;
- I want credit/blame for writing it. E-mail: polowin@silicon.chem.queensu.ca,
- polowinj@qucdn.queensu.ca, Joel.Polowin@p4.f107.n249.z1.fidonet.org .
-
- If you see something wrong with it or it fails to work, PLEASE let me know!
-
- Syntax: filter [filename] [filename ...] string [string ...]
-
- where each string (default max. of 2000) is a term to be searched for
- in lines (default max. 600 chars) of file(s) `filename'. Each term is
- prefixed by one of the following characters:
-
-     +   to show lines which contain string
-     -   to show lines which do not contain string
-     =   to show lines which contain string, case sensitive
-     _   (underscore) to show lines which do not contain string,
-         case sensitive
-
- A string as above may be further prefixed with the letter 'o' to print the
- line if the current OR the preceding condition is true.
-
- A string including blanks and the prefix may be enclosed in double quotes
- on most systems. Your operating system may have other ways of dealing
- with special characters.
-
- "filter" treats the first string that begins with one of the characters
- `+-=_' as the first search term rather than a file name. For files whose
- names begin with one of these characters, see below. The first search
- term cannot begin with `o', as it has no preceding term to be `or'-linked
- to.
-
- For strings that begin with `$', `&', or `^' (usually names of files of search
- terms), see below.
-
- Examples:
-
- filter * +hawk +handsaw o+hound
-
- searches all files in the current directory and prints lines that
- contain the string `hawk' and at least one of `handsaw' or `hound'.
- This assumes that the operating system and compiler accept wild-carded
- file names; else "filter" will be looking for a file named `*'. For
- DOS, one would use `*.*'.
-
- filter armorial =Vert +argent -gules _Or -azure -purpur +foil > tempfile.txt
-
- searches the file `armorial' for lines that contain the string `Vert'
- (case-sensitive) and `argent' (upper or lower case) but not `gules'
- (upper or lower case) and not `Or' (case-sensitive) and not `azure'
- or `purpur' (upper or lower case) and DO contain `foil' (upper or
- lower case); the resulting lines are saved in file `tempfile.txt'.
-
- type temp1.txt | filter +aardvark "o+winged pig" o+wombat "_|B|"
-
- The file `temp1.txt' is fed through the "filter" program, which passes
- lines that contain at least one of `aardvark', `winged pig', or `wombat'
- (upper or lower case) and do NOT contain `|B|' (case-sensitive). The
- result is printed on the screen. Note the use of quotation marks in the
- command line to include the space in `winged pig' and the special
- character `|' in `|B|'.
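-
- The matching rules shown in these examples can be sketched in Python. This
- is an illustration only, not the program's own code: the names
- `parse_term', `condition', and `line_passes' are hypothetical, and the
- grouping rule (a run of `o'-linked terms forms one OR group, and all groups
- must hold) is an assumption that agrees with the examples above. Doubled
- flag characters and search-term files are not handled here.

```python
FLAGS = "+-=_"


def parse_term(term):
    """Split a term such as 'o+hound' into (or_linked, flag, text)."""
    or_linked = len(term) > 1 and term[0] == "o" and term[1] in FLAGS
    if or_linked:
        term = term[1:]
    if not term or term[0] not in FLAGS:
        raise ValueError("search term must start with one of " + FLAGS)
    return or_linked, term[0], term[1:]


def condition(flag, text, line):
    """Evaluate one term against one line of text."""
    if flag in "+-":
        # '+' and '-' ignore case
        found = text.lower() in line.lower()
    else:
        # '=' and '_' are case sensitive
        found = text in line
    # '+' and '=' want the string present; '-' and '_' want it absent
    return found if flag in "+=" else not found


def line_passes(line, terms):
    """True if the line satisfies every OR group of terms (assumed grouping)."""
    groups = []
    for term in terms:
        or_linked, flag, text = parse_term(term)
        result = condition(flag, text, line)
        if or_linked and groups:
            groups[-1] = groups[-1] or result
        else:
            groups.append(result)
    return all(groups)


terms = ["+hawk", "+handsaw", "o+hound"]
print(line_passes("A hawk and a hound", terms))  # True
print(line_passes("a hawk alone", terms))        # False
```

- The first line passes because it contains `hawk' and one of `handsaw' or
- `hound'; the second contains `hawk' only.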
-
- ----------
- File names beginning with one of `+-=_'
-
- If you absolutely MUST use text file names that begin with one of these
- characters, use the character twice when specifying the file name to
- "filter". Thus, the file name `-stdev.c' would be written `--stdev.c';
- `++junk.c' would be written `++++junk.c'.
-
- What "filter" does is to go through each term in the command line and
- count the number of identical flag characters beginning each; half of that
- number, rounded down, is removed from the start of the term. An even
- count specifies a text file name; an odd count designates a search term.
- `+junk' has one flag character, is not changed (1/2 rounds down to 0
- characters removed), and is a search term:
- print lines containing `junk'. `++junk' has two flag characters, is
- shortened to `+junk', and is read as a text file name. `+++junk' is shortened
- to `++junk', and is a search term: print lines containing the string `+junk'.
- `+=junk' has one flag character and is a search term: print lines containing
- the string `=junk'.
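-
- The counting rule just described can be sketched in Python as follows. The
- function name `classify_argument' is hypothetical, and the sketch leaves
- out the `o' prefix and search-term files; it only reproduces the worked
- cases given in the text.

```python
FLAGS = "+-=_"


def classify_argument(arg):
    """Return (kind, shortened_argument) for one command-line string."""
    if not arg or arg[0] not in FLAGS:
        # No leading flag character: this sketch does not classify it further
        return ("unflagged", arg)
    flag = arg[0]
    # Length of the run of identical flag characters at the start
    run = len(arg) - len(arg.lstrip(flag))
    # Remove half of the run, rounded down
    shortened = arg[run // 2:]
    # Even run: text file name; odd run: search term
    kind = "file" if run % 2 == 0 else "term"
    return (kind, shortened)


for arg in ["+junk", "++junk", "+++junk", "+=junk", "--stdev.c"]:
    print(arg, "->", classify_argument(arg))
```

- For `+++junk' this gives a search term `++junk': the first `+' is the
- prefix and `+junk' is the string searched for.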
-
- This means that wild-carded file names that match files whose names begin
- with one of `+-=_' will cause trouble. I'm sorry; the telepathic monitors
- of most computer systems are not software-addressable. A compulsive urge
- to use files whose names begin with punctuation or mathematical symbols
- can now be treated successfully in a majority of cases.
-
- Search terms specified in files (see below) are not themselves in the
- command line, and if they begin with one of `+-=_' those characters should
- not be doubled. Search-term file expansion takes place after "filter"
- determines which command-line strings are file names.
-
- ----------
- `$', `&', and `^' usually flag search-term file names.
-
- A string which begins with `$', `&', or `^' will be expanded as the name of a
- file containing a list of search terms. For example, the string +$critters
- tells "filter" to look for a file `critters'; lines from that file are taken
- as search terms.
-
- If the file name is specified with `^', terms INCLUDING `+-=_' prefixes are
- read from the file. The string specifying the file name is replaced in the
- command line by the list of terms read from the file; the original prefix
- is ignored. "filter" does not add `+-=_' prefixes to the terms.
-
- If the file name is specified with `$' or `&', "filter" adds the prefix to
- each term, depending on which of `$&' is used and which of `+-=_' precedes it.
- With `$', `+' and `=' give or-linked terms, so that text file lines will be
- printed if any search-term file line is matched; `-' and `_' are not
- or-linked, so that text file lines are printed only if no search-term file
- line is matched. With `&', `+' and `=' are not or-linked, so that ALL
- search-term lines must be matched to print a text line; while `-' and `_'
- ARE or-linked, so that any search-term line NOT matched will allow text-line
- printing.
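-
- The prefixing rules for `$' and `&' can be sketched in Python. The function
- `add_prefixes' is a hypothetical name, and the terms are passed in as a
- list rather than read from a file. Keeping the original prefix (including
- any `o') on the first term only is an assumption drawn from the expansions
- shown elsewhere in this document. Nested files and doubled `$&^' characters
- are not handled.

```python
def add_prefixes(prefix, mode, lines):
    """Prefix each search-term line for a '$' or '&' search-term file.

    prefix: the prefix given on the command line, e.g. '+', '_' or 'o-'
    mode:   '$' or '&'
    lines:  the terms read from the search-term file
    """
    sign = prefix[-1]
    positive = sign in "+="
    # '$': positive terms are or-linked, negative terms are not.
    # '&': the other way round.
    or_link = positive if mode == "$" else not positive
    later = ("o" if or_link else "") + sign
    # The first term keeps the prefix exactly as written (assumption).
    return [(prefix if i == 0 else later) + text
            for i, text in enumerate(lines)]


print(add_prefixes("+", "$", ["tree", "grain"]))   # ['+tree', 'o+grain']
print(add_prefixes("+", "&", ["terrier", "hound"]))  # ['+terrier', '+hound']
print(add_prefixes("o-", "$", ["man", "woman"]))   # ['o-man', '-woman']
```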
-
- To search for actual text strings beginning with `$', `&', or `^', double the
- flag characters. Thus, to search for the string `$100', use the search
- string `+$$100'. To expand file names beginning with those characters,
- use three of them: search-term file `$&junk' would be specified with something
- like `-$$$&junk'. In general, when a search term begins with a flag
- character, double each flag character of that kind beginning the term, and
- if the term is a file name, add an extra flag character.
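-
- Read in reverse, the doubling rule suggests how a term might be decoded.
- The Python sketch below is an assumption, not the program's actual logic:
- it supposes that an odd number of identical leading `$', `&', or `^'
- characters marks a search-term file and an even number marks literal text,
- with half of the characters (rounded down) kept. The name
- `decode_term_text' is hypothetical; its input is the term with the `+-=_'
- prefix already removed.

```python
def decode_term_text(text):
    """Return (kind, mode, value) for the text following the +-=_ prefix.

    kind is 'file' or 'literal'; mode is '$', '&' or '^' for a file and
    None for a literal; value is the file name or the string to search for.
    """
    if not text or text[0] not in "$&^":
        return ("literal", None, text)
    char = text[0]
    # Length of the run of identical characters at the start
    run = len(text) - len(text.lstrip(char))
    rest = text[run:]
    kept = char * (run // 2)
    if run % 2 == 0:
        # Doubled characters stand for themselves
        return ("literal", None, kept + rest)
    # One extra character marks a search-term file
    return ("file", char, kept + rest)


for text in ["$critters", "$$100", "$$$&junk", "&&&doggie"]:
    print(text, "->", decode_term_text(text))
```

- Under this assumption `$$100' is the literal string `$100', and `$$$&junk'
- names the search-term file `$&junk', matching the cases described above.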
-
- Search-term files may contain file names, which will be expanded in turn.
- For this reason, initial `$', `&', and `^' characters must be doubled even in
- nested search-term files.
-
- Note that the or-linking logic can get seriously messed up when terms
- beginning with `-' or `_' are expanded carelessly, as "filter" has no good
- sense of logical precedence. If file `human' contains `man' and `woman',
- then `o-$human' would expand as `o-man -woman': only `man' remains
- or-linked to the preceding term.
-
- Examples:
-
- filter armorial +$animal o+$vegetable _^mineral
-
- If file `animal' reads
- $human
- reptile
- amphibian
-
- and file `human' reads
- man
- woman
-
- and file `vegetable' reads
- tree
- grain
-
- and file `mineral' reads
- +rock
- -dirt
-
- then the above will be expanded to:
-
- filter armorial +man o+woman o+reptile o+amphibian o+tree o+grain +rock
- -dirt
-
- filter armorial +$&beastie +&&&doggie -$dragon
-
- If file `&beastie' reads
- unicorn
- $dragon
- manticore
-
- and file `&doggie' reads
- terrier
- hound
- $$paniel
-
- and file `dragon' reads
- wyvern
- dragon
- lizard
-
- then the above will be expanded to
-
- filter armorial +unicorn o+wyvern o+dragon o+lizard o+manticore +terrier
- +hound +$$paniel -wyvern -dragon -lizard
-
- which will print lines from file `armorial' that contain:
- any of: `unicorn', `wyvern', `dragon', `lizard', `manticore'; and
- ALL of: `terrier', `hound', `$paniel'; and
- none of: `wyvern', `dragon', `lizard'.
-
- Revision history:
-
- Version 1.0 September 1992.
-
- 1.1 Sep '92 fixed minor bugs
- 1.2 Sep '92 added 'or'-linking to keywords
- 1.4 Oct '92 fixed a minor error in string lengths, added size DEFINEs
- 1.5 Jan '94 increased string lengths, fixed a Stupid Newbie Error re:
- assumption that *argv[] was writable
- 2.0 Feb '94 added search-term file expansion and multiple text file
- capability, including wildcards when system permits
- 2.1 Jun '94 added null-prefix search-term files and term-file `rewind'
-